Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems
Modern deep learning-based recommendation systems exploit hundreds to
thousands of different categorical features, each with millions of different
categories ranging from clicks to posts. To respect the natural diversity
within the categorical data, embeddings map each category to a unique dense
representation within an embedded space. Since each categorical feature could
take on as many as tens of millions of different possible categories, the
embedding tables form the primary memory bottleneck during both training and
inference. We propose a novel approach for reducing the embedding size in an
end-to-end fashion by exploiting complementary partitions of the category set
to produce a unique embedding vector for each category without explicit
definition. By storing multiple smaller embedding tables based on each
complementary partition and combining embeddings from each table, we define a
unique embedding for each category at smaller memory cost. This approach may be
interpreted as using a specific fixed codebook to ensure uniqueness of each
category's representation. Our experimental results demonstrate the
effectiveness of our approach over the hashing trick for reducing the size of
the embedding tables in terms of model loss and accuracy, while retaining a
similar reduction in the number of parameters.
Comment: 11 pages, 7 figures, 1 table
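The idea of combining several small tables can be illustrated with a minimal sketch. It assumes one particular pair of complementary partitions, the quotient and remainder of the category index with respect to a bucket count, and element-wise multiplication as the combining operation; the abstract does not fix either choice. The class name `QuotientRemainderEmbedding` and all parameter names are hypothetical, and the tables are filled with random values rather than trained.

```python
import random


def make_table(rows, dim, rng):
    """Build a rows x dim table of random values (a stand-in for trained weights)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rows)]


class QuotientRemainderEmbedding:
    """Hypothetical sketch: two small tables indexed by quotient and remainder."""

    def __init__(self, num_categories, num_buckets, dim, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # Ceiling division: number of distinct quotients for the category range.
        num_quotients = -(-num_categories // num_buckets)
        self.quotient_table = make_table(num_quotients, dim, rng)
        self.remainder_table = make_table(num_buckets, dim, rng)

    def index_pair(self, category):
        # Each category maps to a distinct (quotient, remainder) pair, so the
        # two partitions together identify the category uniquely.
        return category // self.num_buckets, category % self.num_buckets

    def lookup(self, category):
        q, r = self.index_pair(category)
        # Assumed combining operation: element-wise product of the two rows.
        return [a * b for a, b in zip(self.quotient_table[q], self.remainder_table[r])]

    def num_rows(self):
        return len(self.quotient_table) + len(self.remainder_table)


emb = QuotientRemainderEmbedding(num_categories=1_000_000, num_buckets=1_000, dim=4)
vector = emb.lookup(123_456)
```

In this configuration the two tables hold 2,000 rows in total instead of the 1,000,000 rows a full table would need, while every category still receives its own index pair. A plain hashing trick with 2,000 rows would instead force many categories to share one row.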
Methods for Quantized Compressed Sensing
In this paper, we compare and catalog the performance of various greedy quantized compressed sensing algorithms that reconstruct sparse signals from quantized compressed measurements. We also introduce two new greedy approaches for reconstruction: Quantized Compressed Sampling Matching Pursuit (QCoSaMP) and Adaptive Outlier Pursuit for Quantized Iterative Hard Thresholding (AOP-QIHT). We compare the performance of greedy quantized compressed sensing algorithms for a given bit-depth, sparsity, and noise level.
Optimizing quantization for Lasso recovery
This letter focuses on quantized Compressed Sensing, assuming that Lasso is used for signal estimation. Leveraging recent work, we provide a framework to optimize the quantization function and show that the recovered signal converges to the actual signal at a quadratic rate as a function of the quantization level. We show that when the number of observations is high, this method of quantization gives a significantly better recovery rate than standard Lloyd-Max quantization. We support our theoretical analysis with numerical simulations.
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Shampoo is an online and stochastic optimization algorithm belonging to the
AdaGrad family of methods for training neural networks. It constructs a
block-diagonal preconditioner where each block consists of a coarse Kronecker
product approximation to full-matrix AdaGrad for each parameter of the neural
network. In this work, we provide a complete description of the algorithm as
well as the performance optimizations that our implementation leverages to
train deep networks at-scale in PyTorch. Our implementation enables fast
multi-GPU distributed data-parallel training by distributing the memory and
computation associated with blocks of each parameter via PyTorch's DTensor data
structure and performing an AllGather primitive on the computed search
directions at each iteration. This major performance enhancement enables us to
achieve at most a 10% performance reduction in per-step wall-clock time
compared against standard diagonal-scaling-based adaptive gradient methods. We
validate our implementation by performing an ablation study on training
ImageNet ResNet50, demonstrating Shampoo's superiority over standard training
recipes with minimal hyperparameter tuning.
Comment: 38 pages, 8 figures, 5 tables
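The Kronecker-factored preconditioner can be illustrated with a minimal single-matrix sketch. It assumes NumPy and the basic Shampoo update for a matrix-shaped parameter, with left and right factors accumulated from the gradient and applied through inverse fourth roots. It is not the distributed implementation described above: blocking, DTensor sharding, the AllGather step, and the other optimizations are omitted, and the names `inverse_root` and `shampoo_step` are hypothetical.

```python
import numpy as np


def inverse_root(mat, power, eps=1e-6):
    """Return (mat + eps*I)^(-1/power) for a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(mat + eps * np.eye(mat.shape[0]))
    # Scale each eigenvector column by its eigenvalue raised to -1/power.
    return (vecs * vals ** (-1.0 / power)) @ vecs.T


def shampoo_step(W, G, L, R, lr=0.1):
    """One basic Shampoo update for a matrix parameter W with gradient G."""
    L = L + G @ G.T  # left factor, shape (m, m)
    R = R + G.T @ G  # right factor, shape (n, n)
    W = W - lr * inverse_root(L, 4) @ G @ inverse_root(R, 4)
    return W, L, R


# Toy problem (an assumption for illustration): minimize 0.5 * ||W - T||_F^2.
T = np.array([[3.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
W = np.zeros((3, 2))
L = np.zeros((3, 3))
R = np.zeros((2, 2))

initial_loss = 0.5 * np.sum((W - T) ** 2)
for _ in range(50):
    G = W - T  # gradient of the toy objective
    W, L, R = shampoo_step(W, G, L, R)
final_loss = 0.5 * np.sum((W - T) ** 2)
```

For a 3 x 2 parameter the two factors hold 9 + 4 entries, whereas a full-matrix AdaGrad preconditioner over the 6 flattened entries would hold 36. That gap grows quickly with layer size, which is the motivation for the Kronecker approximation.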
ESRRB regulates glucocorticoid gene expression in mice and patients with acute lymphoblastic leukemia
Synthetic glucocorticoids (GCs), such as dexamethasone and prednisone, remain key components of therapy for patients with lymphoid malignancies. For pediatric patients with acute lymphoblastic leukemia (ALL), response to GCs remains the most reliable prognostic indicator; failure to respond to GCs correlates with poor event-free survival. To uncover GC resistance mechanisms, we performed a genome-wide, survival-based short hairpin RNA screen and identified the orphan nuclear receptor estrogen-related receptor-beta (ESRRB) as a critical transcription factor that cooperates with the GC receptor (GR) to mediate the GC gene expression signature in mouse and human ALL cells. Esrrb knockdown interfered with the expression of genes that were induced and repressed by GR and resulted in GC resistance in vitro and in vivo. Dexamethasone treatment stimulated ESRRB binding to estrogen-related receptor elements (ERREs) in canonical GC-regulated genes, and H3K27Ac Hi-chromatin immunoprecipitation revealed increased interactions between GR- and ERRE-containing regulatory regions in dexamethasone-treated human T-ALL cells. Furthermore, ESRRB agonists enhanced GC target gene expression and synergized with dexamethasone to induce leukemic cell death, indicating that ESRRB agonists may overcome GC resistance in ALL, and potentially, in other lymphoid malignancies.
The New Generation Atlas of Quasar Spectral Energy Distributions from Radio to X-rays
We have produced the next generation of quasar spectral energy distributions
(SEDs), essentially updating the work of Elvis et al. (1994) by using
high-quality data obtained with several space and ground-based telescopes,
including NASA's Great Observatories. We present an atlas of SEDs of 85
optically bright, non-blazar quasars over the electromagnetic spectrum from
radio to X-rays. The heterogeneous sample includes 27 radio-quiet and 58
radio-loud quasars. Most objects have quasi-simultaneous ultraviolet-optical
spectroscopic data, supplemented with some far-ultraviolet spectra, and more
than half also have Spitzer mid-infrared IRS spectra. The X-ray spectral
parameters are collected from the literature where available. The radio,
far-infrared, and near-infrared photometric data are also obtained from either
the literature or new observations. We construct composite spectral energy
distributions for radio-loud and radio-quiet objects and compare these to those
of Elvis et al., finding that ours have similar overall shapes, but our
improved spectral resolution reveals more detailed features, especially in the
mid and near-infrared.
Comment: 46 pages, 10 figures, 10 tables, Accepted by ApJS. Composite SED data
files for radio-loud and radio-quiet quasars (rlmsedMR.txt, rqmsedMR.txt) are
included in the source (Other formats -> Source). Supplemental figures are
not included
Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species
Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.